Guard Deformable DETR and LW-DETR Hungarian matchers against infinite cost matrices #47066

Open
MutugiD wants to merge 1 commit into huggingface:main from MutugiD:fix-remaining-detr-matcher-infinite-cost

@MutugiD MutugiD commented Jul 4, 2026

CI

What does this PR do?

Fixes #47065. Two more Hungarian matchers share the non-finite-cost crash first reported for RT-DETR in #47000 and still carry the unguarded scipy.optimize.linear_sum_assignment(cost_matrix) pattern on main:

  • DeformableDetrHungarianMatcher in src/transformers/loss/loss_deformable_detr.py
  • LwDetrHungarianMatcher in src/transformers/loss/loss_lw_detr.py

Both use the sigmoid focal classification cost, which can overflow to inf/NaN under AMP/fp16; linear_sum_assignment then raises ValueError: cost matrix is infeasible. This PR replaces non-finite costs with torch.finfo(cost_matrix.dtype).max before matching, following @guarin's recommendation in #47000 and matching the RT-DETR fix in #47016:

max_value = torch.finfo(cost_matrix.dtype).max
cost_matrix = torch.nan_to_num(cost_matrix, nan=max_value, posinf=max_value, neginf=max_value)
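
As a rough illustration of what the guard does, the sketch below runs the same two lines on a toy cost matrix before handing it to SciPy. The helper name `sanitize_cost_matrix` and the 3x3 matrix are hypothetical and not taken from the PR; the exact error message SciPy emits for the unguarded matrix may differ between versions, so only the exception type is checked.

```python
import torch
from scipy.optimize import linear_sum_assignment


def sanitize_cost_matrix(cost_matrix: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: replace NaN and +/-inf with the dtype's largest finite value."""
    max_value = torch.finfo(cost_matrix.dtype).max
    return torch.nan_to_num(cost_matrix, nan=max_value, posinf=max_value, neginf=max_value)


# Toy cost matrix (3 queries x 3 targets); the middle row is entirely non-finite.
cost_matrix = torch.tensor(
    [
        [0.2, 0.9, 0.7],
        [float("nan"), float("inf"), float("nan")],
        [0.8, 0.1, 0.6],
    ]
)

# Without the guard, SciPy rejects the matrix with a ValueError.
try:
    linear_sum_assignment(cost_matrix.numpy())
    raised = False
except ValueError:
    raised = True

# With the guard, every entry is finite and each query is assigned a distinct target.
safe = sanitize_cost_matrix(cost_matrix)
row_ind, col_ind = linear_sum_assignment(safe.numpy())
print(raised, row_ind.tolist(), sorted(col_ind.tolist()))
```

Mapping non-finite entries to the largest finite value keeps them as the least attractive choices for the solver while still leaving a feasible assignment.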

This completes the sweep started in #47048 (DETR + Grounding DINO) across the remaining DETR-family loss matchers.

Tests

Adds a CPU-only regression test for each matcher:

  • tests/models/deformable_detr/test_modeling_deformable_detr.py::DeformableDetrHungarianMatcherInfeasibleCostTest
  • tests/models/lw_detr/test_modeling_lw_detr.py::LwDetrHungarianMatcherInfeasibleCostTest

Each feeds non-finite inputs (a NaN logit and an inf box coordinate) through the matcher, which yields a non-finite cost matrix, and asserts that the matcher no longer raises and returns correctly sized assignments. Both fail on main (with the ValueError) and pass with the guard.

pytest tests/models/deformable_detr/test_modeling_deformable_detr.py::DeformableDetrHungarianMatcherInfeasibleCostTest \
       tests/models/lw_detr/test_modeling_lw_detr.py::LwDetrHungarianMatcherInfeasibleCostTest
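
The shape of such a regression test can be sketched as follows. This is not the test added by the PR: `toy_match` is a hypothetical stand-in for a Hungarian matcher (a simplified class cost plus an L1 box cost, followed by the guard), and the tensor shapes and values are assumptions chosen only to reproduce a NaN logit and an inf box coordinate.

```python
import torch
from scipy.optimize import linear_sum_assignment


def toy_match(logits, pred_boxes, target_labels, target_boxes):
    """Hypothetical stand-in matcher: simplified class cost + L1 box cost, then the guard."""
    prob = logits.sigmoid()
    class_cost = -prob[:, target_labels]  # (num_queries, num_targets)
    # Pairwise L1 distance between predicted and target boxes.
    bbox_cost = (pred_boxes[:, None, :] - target_boxes[None, :, :]).abs().sum(-1)
    cost_matrix = class_cost + bbox_cost
    # Guard from the PR: map NaN/inf to the largest finite value of the dtype.
    max_value = torch.finfo(cost_matrix.dtype).max
    cost_matrix = torch.nan_to_num(cost_matrix, nan=max_value, posinf=max_value, neginf=max_value)
    return linear_sum_assignment(cost_matrix.numpy())


def test_toy_matcher_handles_non_finite_costs():
    num_queries, num_classes, num_targets = 4, 3, 2
    logits = torch.zeros(num_queries, num_classes)
    logits[0, 1] = float("nan")  # NaN logit
    pred_boxes = torch.full((num_queries, 4), 0.5)
    pred_boxes[1, 0] = float("inf")  # inf box coordinate
    target_labels = torch.tensor([1, 2])
    target_boxes = torch.full((num_targets, 4), 0.25)

    # Must not raise, and must return one query index per target.
    row_ind, col_ind = toy_match(logits, pred_boxes, target_labels, target_boxes)
    assert len(row_ind) == num_targets
    assert len(col_ind) == num_targets
```

Dropping the `nan_to_num` line from `toy_match` should make the same test fail with a ValueError, which mirrors the fail-on-main, pass-with-guard behavior described above.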

Before submitting

Who can review?

@guarin @yonigozlan

… cost matrices

Under AMP/fp16, the sigmoid focal classification cost can overflow to inf/NaN,
which makes scipy's linear_sum_assignment raise "ValueError: cost matrix is
infeasible". Replace non-finite costs with torch.finfo(dtype).max before
matching, matching the RT-DETR fix in huggingface#47016. Adds regression tests for both
matchers.

Fixes huggingface#47065.

github-actions Bot commented Jul 4, 2026

Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: deformable_detr, lw_detr

github-actions Bot commented Jul 4, 2026

Contributor

CI recap

Dashboard: View test results in Grafana
Latest run: 28704613947:2
Result: failure | Jobs: 15 | Tests: 171,305 | Failures: 6 | Duration: 24h 14m

MutugiD commented Jul 5, 2026

Author

The tests_non_model failure here is unrelated to this PR. All 6 failing tests are CLI tests failing with the same error:

tests/cli/test_chat.py::test_help
tests/cli/test_download.py::test_cli_download
tests/cli/test_download.py::test_cli_download_trust_remote
tests/cli/test_serve.py::test_host_port_blocking
tests/cli/test_system.py::test_cli_env
tests/cli/test_system.py::test_cli_version
-> AttributeError: 'HFCliTyperGroup' object has no attribute '_add_completion'

This is the huggingface_hub 1.22 CLI breakage tracked in #47059 and fixed by #47064; it currently affects main and every open PR.

This PR only touches the two loss matchers and their tests, and the tests_torch shards that run the new regression tests are green. I'll rebase onto main once #47064 is merged so the full run goes green.

Development

Successfully merging this pull request may close these issues.

DeformableDetrHungarianMatcher and LwDetrHungarianMatcher crash on non-finite cost matrices ("cost matrix is infeasible")